exponnorm (Exponentially Modified Normal / exGaussian)#

exponnorm (SciPy’s name for the exponentially modified Gaussian, or exGaussian) is the distribution of a normal random variable plus an independent exponential delay.

It’s a simple, interpretable model for right-skewed continuous data: symmetric measurement noise (Normal) + one-sided waiting time (Exponential).

What you’ll learn#

  • what exponnorm models and when it’s appropriate

  • the PDF/CDF in closed form (and how SciPy parameterizes it)

  • mean/variance/skewness/kurtosis, plus MGF/characteristic function and entropy notes

  • how parameters change the shape (skew and tail behavior)

  • key derivations: expectation, variance, likelihood

  • NumPy-only sampling via the “Normal + Exponential” construction

  • SciPy usage: scipy.stats.exponnorm (pdf, cdf, rvs, fit)

  • statistical use cases: hypothesis testing, Bayesian modeling patterns, generative modeling

Notebook roadmap#

  1. Title & classification

  2. Intuition & motivation

  3. Formal definition (PDF/CDF)

  4. Moments & properties

  5. Parameter interpretation

  6. Derivations ((\mathbb{E}[X]), (\mathrm{Var}(X)), likelihood)

  7. Sampling & simulation (NumPy-only)

  8. Visualization (PDF, CDF, Monte Carlo)

  9. SciPy integration (scipy.stats.exponnorm)

  10. Statistical use cases

  11. Pitfalls

  12. Summary

import math

import numpy as np
import scipy
from scipy import special, stats
from scipy.optimize import brentq, minimize

import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
from plotly.subplots import make_subplots

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

SEED = 7
rng = np.random.default_rng(SEED)

np.set_printoptions(precision=6, suppress=True)

print("numpy ", np.__version__)
print("scipy ", scipy.__version__)
print("plotly", plotly.__version__)
numpy  1.26.2
scipy  1.15.0
plotly 6.5.2

Prerequisites & notation#

Prerequisites

  • basic probability (PDF/CDF, independence, expectation)

  • comfort with calculus (a convolution integral)

Notation

  • standard normal PDF/CDF: (\phi(\cdot)), (\Phi(\cdot))

  • complementary error function: (\mathrm{erfc}(\cdot))

Parameterizations

A common “exGaussian” construction is:

[ X = N + D,\quad N \sim \mathcal{N}(\mu,\sigma^2),\quad D \sim \mathrm{Exp}(\text{rate}=\lambda),\quad N \perp D. ]

SciPy’s exponnorm uses a dimensionless shape parameter (K) instead of (\lambda):

  • stats.exponnorm(K, loc=μ, scale=σ)

  • mapping: (K = \frac{1}{\sigma\lambda}) (so (\lambda = \frac{1}{K\sigma}))

  • exponential mean (delay scale): (\tau = 1/\lambda = K\sigma)

In the standardized form (loc=0, scale=1): (Y = Z + E) with (Z\sim \mathcal{N}(0,1)) and (E\sim \mathrm{Exp}(\text{rate}=1/K)).
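As a quick numeric check of this mapping (an illustrative sketch; the specific values of (\mu), (\sigma), (\lambda) are arbitrary):

```python
import numpy as np
from scipy import stats

# exGaussian parameters: N(mu, sigma^2) noise plus Exp(rate=lam) delay
mu, sigma, lam = -0.2, 0.8, 1.25

# SciPy mapping: K = 1/(sigma*lam), loc = mu, scale = sigma
K = 1.0 / (sigma * lam)
rv = stats.exponnorm(K, loc=mu, scale=sigma)

# The construction predicts mean = mu + 1/lam and var = sigma^2 + 1/lam^2
mean_sp, var_sp = rv.stats(moments="mv")
print("mean:", float(mean_sp), "vs", mu + 1.0 / lam)
print("var :", float(var_sp), "vs", sigma**2 + 1.0 / lam**2)
```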

1) Title & Classification#

  • Name: exponnorm (Exponentially modified normal / exGaussian)

  • Type: continuous

  • Support: (x \in \mathbb{R})

  • Parameter space (SciPy):

    • (K > 0) (shape)

    • (\text{loc} \in \mathbb{R}) (location)

    • (\text{scale} > 0) (scale)

A useful generative view (SciPy parameterization):

[ X = \text{loc} + \text{scale},(Z + E), \quad Z \sim \mathcal{N}(0,1), \quad E \sim \mathrm{Exp}(\text{rate}=1/K), \quad Z \perp E. ]

Equivalently, (X = N + D) with (N \sim \mathcal{N}(\text{loc},,\text{scale}^2)) and (D \sim \mathrm{Exp}(\text{rate}=1/(K,\text{scale}))).

2) Intuition & Motivation#

What it models#

exponnorm is a convolution of a normal and an exponential distribution.

Think of an observation as:

  • a symmetric baseline measurement (N) (sensor noise, natural variability), plus

  • a one-sided random delay (D \ge 0) (queueing, reaction-time delay, processing latency).

The exponential part creates a long right tail and positive skew, while the normal part keeps the distribution “normal-like” near its center.

Typical real-world use cases#

  • Reaction times (psychology / neuroscience): decision time + motor delay, often right-skewed.

  • Network / system latency: baseline jitter + occasional queueing delays.

  • Service times in queues: variability around a typical duration plus delay bursts.

  • Chromatography / mass spectrometry: peak shapes with exponential tailing.

Relations to other distributions#

  • (K \to 0) (or (\lambda \to \infty)): the exponential delay vanishes and the distribution approaches a normal.

  • (\sigma \to 0) (keeping (\tau) fixed): the normal collapses and you approach a shifted exponential.

  • Compared to lognormal or gamma, exponnorm has an explicit “noise + delay” interpretation and a very simple sampling story.
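The (K \to 0) limit is easy to verify numerically (a small sketch; we compare the mean-centered exponnorm against the variance-matched normal (\mathcal{N}(0, 1+K^2))):

```python
import numpy as np
from scipy import stats

# As K -> 0, the mean-centered exponnorm approaches a normal.
# exponnorm(K, loc=-K, scale=1) has mean 0 and variance 1 + K^2,
# so N(0, 1 + K^2) is the matching reference.
x = np.linspace(-5.0, 5.0, 401)
gaps = []
for K in (0.5, 0.1, 0.02):
    rv = stats.exponnorm(K, loc=-K, scale=1.0)
    ref = stats.norm(loc=0.0, scale=np.sqrt(1.0 + K * K))
    gap = float(np.max(np.abs(rv.pdf(x) - ref.pdf(x))))
    gaps.append(gap)
    print(f"K={K:<4g} max |pdf - normal pdf| = {gap:.3e}")
```

The gap shrinks monotonically as (K) decreases, matching the limiting statement above.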

3) Formal Definition#

Hierarchical (convolution) definition#

Let (Z \sim \mathcal{N}(0,1)) and (E \sim \mathrm{Exp}(\text{rate}=1/K)) be independent with (K>0). Define the standardized variable (Y = Z + E). Then:

[ Y \sim \mathrm{exponnorm}(K). ]

The general (shifted/scaled) variable is:

[ X = \text{loc} + \text{scale},Y. ]

PDF#

For the standardized form ((\text{loc}=0), (\text{scale}=1)), SciPy uses:

[ f_Y(y;K) = \frac{1}{2K}\exp!\left(\frac{1}{2K^2}-\frac{y}{K}\right), \mathrm{erfc}!\left(\frac{1/K - y}{\sqrt{2}}\right), \qquad y\in\mathbb{R},;K>0. ]

The shifted/scaled PDF is obtained by the change of variables (y=(x-\text{loc})/\text{scale}):

[ f_X(x;K,\text{loc},\text{scale}) = \frac{1}{\text{scale}}, f_Y!\left(\frac{x-\text{loc}}{\text{scale}};K\right). ]

CDF#

A convenient closed form in terms of the standard normal CDF (\Phi) is:

[ F_Y(y;K) = \Phi(y) - \exp!\left(\frac{1}{2K^2}-\frac{y}{K}\right), \Phi!\left(y-\frac{1}{K}\right). ]

Again, (F_X(x) = F_Y!\big((x-\text{loc})/\text{scale}\big)).

We’ll implement these formulas (carefully, in log-space where helpful) and check them against SciPy.

def exgauss_to_scipy(mu: float, sigma: float, lam: float) -> tuple[float, float, float]:
    """Map (μ, σ, λ) exGaussian parameters to SciPy's (K, loc, scale)."""

    if sigma <= 0:
        raise ValueError("sigma must be > 0")
    if lam <= 0:
        raise ValueError("lam must be > 0")

    K = 1.0 / (sigma * lam)
    return float(K), float(mu), float(sigma)


def scipy_to_exgauss(K: float, loc: float, scale: float) -> tuple[float, float, float, float]:
    """Map SciPy's (K, loc, scale) to (μ, σ, λ, τ) where τ=1/λ is the exp mean."""

    if K <= 0:
        raise ValueError("K must be > 0")
    if scale <= 0:
        raise ValueError("scale must be > 0")

    lam = 1.0 / (K * scale)
    tau = 1.0 / lam
    return float(loc), float(scale), float(lam), float(tau)


def exponnorm_logpdf_standard(y: np.ndarray, K: float) -> np.ndarray:
    """Log-PDF of the standardized exponnorm (loc=0, scale=1).

    Formula (SciPy):
        f(y) = 1/(2K) * exp(1/(2K^2) - y/K) * erfc((1/K - y)/sqrt(2))

    We evaluate it stably via erfcx(z) = exp(z^2) * erfc(z).
    """

    y = np.asarray(y, dtype=float)
    if K <= 0:
        raise ValueError("K must be > 0")

    z = (1.0 / K - y) / np.sqrt(2.0)

    return (
        -np.log(2.0 * K)
        + (1.0 / (2.0 * K * K))
        - y / K
        - z * z
        + np.log(special.erfcx(z))
    )


def exponnorm_logpdf(x: np.ndarray, K: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    """Log-PDF of exponnorm(K, loc, scale) using the standardized log-PDF."""

    x = np.asarray(x, dtype=float)
    if scale <= 0:
        raise ValueError("scale must be > 0")

    y = (x - loc) / scale
    return exponnorm_logpdf_standard(y, K) - np.log(scale)


def exponnorm_pdf(x: np.ndarray, K: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    return np.exp(exponnorm_logpdf(x, K, loc=loc, scale=scale))


def exponnorm_cdf_standard(y: np.ndarray, K: float) -> np.ndarray:
    """CDF of the standardized exponnorm.

    Closed form:
        F(y) = Φ(y) - exp(1/(2K^2) - y/K) * Φ(y - 1/K)

    We evaluate the second term using logcdf for stability.
    """

    y = np.asarray(y, dtype=float)
    if K <= 0:
        raise ValueError("K must be > 0")

    term1 = stats.norm.cdf(y)
    log_term2 = (1.0 / (2.0 * K * K)) - y / K + stats.norm.logcdf(y - 1.0 / K)
    term2 = np.exp(log_term2)
    return term1 - term2


def exponnorm_cdf(x: np.ndarray, K: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    x = np.asarray(x, dtype=float)
    if scale <= 0:
        raise ValueError("scale must be > 0")

    y = (x - loc) / scale
    return exponnorm_cdf_standard(y, K)


def exponnorm_stats_closed_form(
    K: float,
    loc: float = 0.0,
    scale: float = 1.0,
) -> tuple[float, float, float, float]:
    """Return (mean, var, skew, excess_kurtosis)."""

    if K <= 0:
        raise ValueError("K must be > 0")
    if scale <= 0:
        raise ValueError("scale must be > 0")

    mean = loc + scale * K
    var = (scale * scale) * (1.0 + K * K)
    skew = 2.0 * (K**3) / ((1.0 + K * K) ** 1.5)
    excess_kurt = 6.0 * (K**4) / ((1.0 + K * K) ** 2.0)
    return float(mean), float(var), float(skew), float(excess_kurt)


def exponnorm_mgf(t: np.ndarray, K: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    """MGF M_X(t) = E[exp(tX)]. Exists only for t < 1/(K*scale)."""

    t = np.asarray(t, dtype=float)
    if K <= 0:
        raise ValueError("K must be > 0")
    if scale <= 0:
        raise ValueError("scale must be > 0")

    denom = 1.0 - K * scale * t
    if np.any(denom <= 0):
        raise ValueError("MGF exists only for t < 1/(K*scale).")

    return np.exp(loc * t + 0.5 * (scale * t) ** 2) / denom


def exponnorm_cf(t: np.ndarray, K: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    """Characteristic function φ_X(t) = E[exp(i t X)]."""

    t = np.asarray(t, dtype=float)
    if K <= 0:
        raise ValueError("K must be > 0")
    if scale <= 0:
        raise ValueError("scale must be > 0")

    denom = 1.0 - 1j * K * scale * t
    return np.exp(1j * loc * t - 0.5 * (scale * t) ** 2) / denom


def sample_exponnorm_numpy(
    n: int,
    K: float,
    rng: np.random.Generator,
    loc: float = 0.0,
    scale: float = 1.0,
) -> np.ndarray:
    """NumPy-only sampling via X = loc + scale * (Z + E).

    Z ~ N(0,1)
    E ~ Exp(rate=1/K)  (mean K)
    """

    if n <= 0:
        raise ValueError("n must be positive")
    if K <= 0:
        raise ValueError("K must be > 0")
    if scale <= 0:
        raise ValueError("scale must be > 0")

    z = rng.normal(size=n)
    e = rng.exponential(scale=K, size=n)
    return loc + scale * (z + e)


def sample_moments(x: np.ndarray) -> tuple[float, float, float, float]:
    """Return (mean, var, skew, excess_kurtosis) using population moments (ddof=0)."""

    x = np.asarray(x, dtype=float)
    m = float(np.mean(x))
    v = float(np.var(x, ddof=0))
    s = float(np.sqrt(v))
    if s == 0:
        return m, v, np.nan, np.nan

    z = (x - m) / s
    skew = float(np.mean(z**3))
    excess_kurt = float(np.mean(z**4) - 3.0)
    return m, v, skew, excess_kurt


def K_from_skewness(gamma1: float, eps: float = 1e-10) -> float:
    """Invert gamma1 = 2 K^3 / (1 + K^2)^(3/2) for K>0.

    Theoretical range: gamma1 ∈ [0, 2). We clip slightly below 2 for stability.
    """

    gamma1 = float(np.clip(gamma1, 0.0, 2.0 - 1e-12))
    if gamma1 < eps:
        return eps

    def f(K: float) -> float:
        return 2.0 * (K**3) / ((1.0 + K * K) ** 1.5) - gamma1

    return float(brentq(f, eps, 1e6))


def exponnorm_mom_initial_guess(x: np.ndarray) -> tuple[float, float, float]:
    """Method-of-moments initial guess (K, loc, scale) from mean/var/skew."""

    m, v, skew, _ = sample_moments(x)
    K0 = K_from_skewness(skew if np.isfinite(skew) else 0.0)

    scale0 = float(np.sqrt(v / (1.0 + K0 * K0)))
    loc0 = float(m - scale0 * K0)
    return float(K0), float(loc0), float(scale0)


def x_grid_for_plotting(rv, q_low: float = 1e-3, q_high: float = 0.999, n: int = 600) -> np.ndarray:
    """Choose a reasonable finite x-grid for plotting a continuous distribution."""

    lo = float(rv.ppf(q_low))
    hi = float(rv.ppf(q_high))

    if not (np.isfinite(lo) and np.isfinite(hi) and hi > lo):
        mean, var = rv.stats(moments="mv")
        mean = float(mean)
        var = float(var)
        if np.isfinite(mean) and np.isfinite(var) and var > 0:
            std = float(np.sqrt(var))
            lo = mean - 6 * std
            hi = mean + 10 * std  # skewed right: give more room on the right
        else:
            lo, hi = -5.0, 15.0

    return np.linspace(lo, hi, n)


def empirical_cdf(samples: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    samples = np.asarray(samples, dtype=float)
    xs = np.sort(samples)
    n = xs.size
    ys = np.arange(1, n + 1) / n
    return xs, ys
# Quick correctness check vs SciPy (PDF/CDF + moments)

K, loc, scale = 1.5, -0.2, 0.8
rv = stats.exponnorm(K, loc=loc, scale=scale)

x = x_grid_for_plotting(rv, q_low=1e-4, q_high=0.9999, n=500)

pdf_err = float(np.max(np.abs(exponnorm_pdf(x, K, loc=loc, scale=scale) - rv.pdf(x))))
cdf_err = float(np.max(np.abs(exponnorm_cdf(x, K, loc=loc, scale=scale) - rv.cdf(x))))

mean_cf, var_cf, skew_cf, kurt_cf = exponnorm_stats_closed_form(K, loc=loc, scale=scale)
mean_sp, var_sp, skew_sp, kurt_sp = rv.stats(moments="mvsk")

{
    "max|pdf - SciPy|": pdf_err,
    "max|cdf - SciPy|": cdf_err,
    "mean (closed, SciPy)": (mean_cf, float(mean_sp)),
    "var  (closed, SciPy)": (var_cf, float(var_sp)),
    "skew (closed, SciPy)": (skew_cf, float(skew_sp)),
    "kurt_excess (closed, SciPy)": (kurt_cf, float(kurt_sp)),
}
{'max|pdf - SciPy|': 1.942890293094024e-16,
 'max|cdf - SciPy|': 2.220446049250313e-16,
 'mean (closed, SciPy)': (1.0000000000000002, 1.0000000000000002),
 'var  (closed, SciPy)': (2.0800000000000005, 2.08),
 'skew (closed, SciPy)': (1.1520696383139373, 1.1520696383139375),
 'kurt_excess (closed, SciPy)': (2.8757396449704142, 2.8757396449704142)}

4) Moments & Properties#

Mean, variance, skewness, kurtosis#

Using the additive construction (X = N + D) (normal + exponential), moments follow cleanly.

For SciPy’s parameters ((K,\text{loc},\text{scale})):

[ \mathbb{E}[X] = \text{loc} + K,\text{scale} ]

[ \mathrm{Var}(X) = \text{scale}^2,(1+K^2) ]

Skewness and excess kurtosis depend only on (K) (location/scale don’t change these shape measures):

[ \gamma_1 = \frac{2K^3}{(1+K^2)^{3/2}}, \qquad \gamma_2 = \frac{6K^4}{(1+K^2)^2}. ]

As (K\to 0), (\gamma_1\to 0) and you approach a normal; as (K\to \infty), (\gamma_1\to 2) and you approach an exponential-like skew.
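A quick check that the shape measures depend only on (K) and that (\gamma_1) stays below 2 (the grid of (K) values and the loc/scale offsets are illustrative):

```python
import numpy as np
from scipy import stats

# gamma_1 = 2K^3/(1+K^2)^(3/2) lies in [0, 2), increases with K,
# and is unchanged by loc/scale.
skews, skews_shifted = [], []
for K in (0.1, 1.0, 10.0, 100.0):
    s = float(stats.exponnorm(K).stats(moments="s"))
    s2 = float(stats.exponnorm(K, loc=3.0, scale=2.0).stats(moments="s"))
    skews.append(s)
    skews_shifted.append(s2)
    print(f"K={K:6g}  skew={s:.6f}  (with loc=3, scale=2: {s2:.6f})")
```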

MGF / characteristic function#

Because (X) is a sum of independent variables, MGFs multiply.

For (X\sim\mathrm{exponnorm}(K,\text{loc},\text{scale})):

[ M_X(t) = \mathbb{E}[e^{tX}] = \frac{\exp!\left(\text{loc},t + \tfrac{1}{2}(\text{scale},t)^2\right)}{1 - K,\text{scale},t}, \qquad t < \frac{1}{K,\text{scale}}. ]

[ \varphi_X(t) = \mathbb{E}[e^{itX}] = \frac{\exp!\left(i,\text{loc},t - \tfrac{1}{2}(\text{scale},t)^2\right)}{1 - iK,\text{scale},t}. ]

Entropy#

The differential entropy (h(X) = -\mathbb{E}[\log f_X(X)]) does not have a simple elementary closed form.

In practice you compute it numerically (SciPy provides entropy), and scaling behaves as usual: (h(\text{loc}+\text{scale},Y)=h(Y)+\log(\text{scale})).
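A numerical check of the scaling rule (the specific (K), loc, and scale values are arbitrary):

```python
import numpy as np
from scipy import stats

# h(loc + scale*Y) = h(Y) + log(scale); loc does not affect entropy.
K = 1.3
h_std = float(stats.exponnorm(K).entropy())
for scale in (0.5, 2.0):
    h = float(stats.exponnorm(K, loc=-4.0, scale=scale).entropy())
    print(f"scale={scale:g}  entropy={h:.6f}  h_std + log(scale)={h_std + np.log(scale):.6f}")
```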

# Moment checks + MGF/CF + entropy sanity checks

K, loc, scale = 1.2, -0.5, 0.8
rv = stats.exponnorm(K, loc=loc, scale=scale)

# Closed form vs SciPy
closed = exponnorm_stats_closed_form(K, loc=loc, scale=scale)
scipy_stats = tuple(float(v) for v in rv.stats(moments="mvsk"))

print("(mean, var, skew, kurt_excess)")
print("closed:", closed)
print("scipy :", scipy_stats)

# Monte Carlo checks
n = 200_000
samples = rv.rvs(size=n, random_state=rng)
mc = sample_moments(samples)

print("mc   :", mc)

# MGF at a safe t
# Domain: t < 1/(K*scale)
t = 0.4
mgf_theory = float(exponnorm_mgf(t, K, loc=loc, scale=scale))
mgf_mc = float(np.mean(np.exp(t * samples)))

# Characteristic function at a frequency
w = 1.0
cf_theory = exponnorm_cf(w, K, loc=loc, scale=scale)
cf_mc = np.mean(np.exp(1j * w * samples))

print("mgf theory vs mc:", mgf_theory, mgf_mc)
print("cf  theory vs mc:", cf_theory, cf_mc)

# Entropy: SciPy vs Monte Carlo estimate -E[log f(X)]
h_scipy = float(rv.entropy())
h_mc = float(-np.mean(rv.logpdf(samples)))

print("entropy scipy vs mc:", h_scipy, h_mc)
(mean, var, skew, kurt_excess)
closed: (0.45999999999999996, 1.5616000000000003, 0.9067529857542795, 2.0897608169846813)
scipy : (0.45999999999999996, 1.5616, 0.9067529857542795, 2.0897608169846817)
mc   : (0.4597599350100935, 1.5543152074902815, 0.9030605189927701, 2.0771188553706823)
mgf theory vs mc: 1.3989309187572552 1.3974357979472805
cf  theory vs mc: (0.5055499322164319+0.1371935417217969j) (0.5057498583668312+0.13764341874813715j)
entropy scipy vs mc: 1.5894958483983916 1.5875920033245619

5) Parameter Interpretation#

Shape parameter (K)#

  • (K) is dimensionless: it’s the exponential mean measured in units of the normal standard deviation.

  • In the ((\mu,\sigma,\lambda)) parameterization: (K = 1/(\sigma\lambda) = \tau/\sigma).

Intuition:

  • small (K): the exponential delay is tiny relative to the normal noise (\Rightarrow) nearly symmetric (almost normal)

  • large (K): the delay dominates (\Rightarrow) strong right tail and skew

loc and scale#

  • loc shifts the distribution left/right.

  • scale stretches the distribution and also scales the exponential delay (since the exponential component is multiplied by scale).

Mean/variance reminders: (\mathbb{E}[X]=\text{loc}+K,\text{scale}), (\mathrm{Var}(X)=\text{scale}^2(1+K^2)).

Shape changes#

Below we vary (K) (keeping the mean fixed) and vary scale to see how the PDF/CDF morph.

# Shape changes as K varies (keep mean fixed at 0 by setting loc = -K when scale=1)

Ks = [0.1, 0.5, 1.0, 2.0]
rvs = [stats.exponnorm(K, loc=-K, scale=1.0) for K in Ks]

lo = min(float(rv.ppf(0.001)) for rv in rvs)
hi = max(float(rv.ppf(0.999)) for rv in rvs)
x = np.linspace(lo, hi, 700)

fig = make_subplots(rows=1, cols=2, subplot_titles=("PDF", "CDF"))

for K, rv in zip(Ks, rvs):
    fig.add_trace(go.Scatter(x=x, y=rv.pdf(x), mode="lines", name=f"K={K:g}"), row=1, col=1)
    fig.add_trace(go.Scatter(x=x, y=rv.cdf(x), mode="lines", showlegend=False), row=1, col=2)

fig.add_vline(x=0.0, line_dash="dot", opacity=0.35)
fig.update_layout(title="exponnorm: varying K (mean aligned to 0)")
fig.update_xaxes(title_text="x", row=1, col=1)
fig.update_xaxes(title_text="x", row=1, col=2)
fig.update_yaxes(title_text="density", row=1, col=1)
fig.update_yaxes(title_text="cdf", row=1, col=2)
fig.show()


# Scale changes (keep mean fixed at 0 by setting loc = -scale*K)
K = 1.0
scales = [0.5, 1.0, 2.0]
rvs = [stats.exponnorm(K, loc=-(s * K), scale=s) for s in scales]

lo = min(float(rv.ppf(0.001)) for rv in rvs)
hi = max(float(rv.ppf(0.999)) for rv in rvs)
x = np.linspace(lo, hi, 700)

fig = go.Figure()
for s, rv in zip(scales, rvs):
    fig.add_trace(go.Scatter(x=x, y=rv.pdf(x), mode="lines", name=f"scale={s:g}"))

fig.add_vline(x=0.0, line_dash="dot", opacity=0.35)
fig.update_layout(
    title="exponnorm: varying scale (mean aligned to 0, K fixed)",
    xaxis_title="x",
    yaxis_title="density",
)
fig.show()

6) Derivations#

6.1 Expectation#

Using the exGaussian sum representation:

[ X = N + D, \quad N\sim\mathcal{N}(\mu,\sigma^2), \quad D\sim\mathrm{Exp}(\text{rate}=\lambda), \quad N\perp D. ]

Linearity of expectation gives:

[ \mathbb{E}[X] = \mathbb{E}[N] + \mathbb{E}[D] = \mu + \frac{1}{\lambda}. ]

Mapping to SciPy, (\mu=\text{loc}) and (1/\lambda=\tau = K,\text{scale}), so (\mathbb{E}[X]=\text{loc}+K,\text{scale}).

6.2 Variance#

Independence implies variances add:

[ \mathrm{Var}(X) = \mathrm{Var}(N) + \mathrm{Var}(D) = \sigma^2 + \frac{1}{\lambda^2}. ]

Under SciPy’s parameterization, (\sigma=\text{scale}) and (1/\lambda=K,\text{scale}), hence (\mathrm{Var}(X)=\text{scale}^2(1+K^2)).

6.3 Likelihood (for data)#

Given i.i.d. observations (x_1,\dots,x_n), the likelihood is

[ L(K,\text{loc},\text{scale}) = \prod_{i=1}^n f_X(x_i;K,\text{loc},\text{scale}). ]

The log-likelihood is

[ \ell = \sum_{i=1}^n \log f_X(x_i;K,\text{loc},\text{scale}) = -n\log(\text{scale}) + \sum_{i=1}^n \log f_Y!\left(\frac{x_i-\text{loc}}{\text{scale}};K\right), ]

where (f_Y) is the standardized density.

There is no simple closed-form MLE; numerical optimization (or scipy.stats.exponnorm.fit) is used.

# Likelihood + MLE demo (SciPy fit vs a simple custom optimizer)

# Simulate data from a known parameter set
K_true, loc_true, scale_true = 1.3, -0.5, 0.8
n = 3_000
x = sample_exponnorm_numpy(n, K=K_true, loc=loc_true, scale=scale_true, rng=rng)

# SciPy MLE
K_hat_scipy, loc_hat_scipy, scale_hat_scipy = stats.exponnorm.fit(x)


def loglik(x: np.ndarray, K: float, loc: float, scale: float) -> float:
    return float(np.sum(exponnorm_logpdf(x, K, loc=loc, scale=scale)))


def fit_exponnorm_mle(x: np.ndarray) -> tuple[float, float, float]:
    x = np.asarray(x, dtype=float)

    K0, loc0, scale0 = exponnorm_mom_initial_guess(x)
    theta0 = np.array([math.log(K0), loc0, math.log(scale0)], dtype=float)

    def nll(theta: np.ndarray) -> float:
        logK, loc, logscale = float(theta[0]), float(theta[1]), float(theta[2])
        K = float(np.exp(logK))
        scale = float(np.exp(logscale))
        if not (np.isfinite(K) and np.isfinite(loc) and np.isfinite(scale) and scale > 0):
            return np.inf
        return -loglik(x, K, loc=loc, scale=scale)

    res = minimize(nll, x0=theta0, method="Nelder-Mead")
    logK_hat, loc_hat, logscale_hat = (float(v) for v in res.x)
    return float(np.exp(logK_hat)), float(loc_hat), float(np.exp(logscale_hat))


K_hat_opt, loc_hat_opt, scale_hat_opt = fit_exponnorm_mle(x)

print("true :", (K_true, loc_true, scale_true))
print("scipy:", (float(K_hat_scipy), float(loc_hat_scipy), float(scale_hat_scipy)))
print("opt  :", (K_hat_opt, loc_hat_opt, scale_hat_opt))

print("loglik true :", loglik(x, K_true, loc_true, scale_true))
print("loglik scipy:", loglik(x, float(K_hat_scipy), float(loc_hat_scipy), float(scale_hat_scipy)))
print("loglik opt  :", loglik(x, K_hat_opt, loc_hat_opt, scale_hat_opt))
true : (1.3, -0.5, 0.8)
scipy: (1.343323830978064, -0.5385454706669304, 0.8036398274203335)
opt  : (1.343429820268832, -0.538601664756589, 0.8036168180048361)
loglik true : -4950.712560390255
loglik scipy: -4949.376344116159
loglik opt  : -4949.376344452656

7) Sampling & Simulation (NumPy-only)#

Algorithm (Normal + Exponential)#

Use the defining construction:

  1. Sample (Z \sim \mathcal{N}(0,1)).

  2. Sample (E \sim \mathrm{Exp}(\text{rate}=1/K)) (mean (K)).

  3. Set (Y = Z + E).

  4. Return (X = \text{loc} + \text{scale},Y).

This is efficient and numerically stable: it uses only well-behaved base samplers (standard normal and exponential) and never needs to evaluate the exponnorm PDF, CDF, or quantile function.

# Monte Carlo sanity check of the NumPy-only sampler

K, loc, scale = 0.9, -0.3, 1.1
n = 200_000
samples = sample_exponnorm_numpy(n, K=K, loc=loc, scale=scale, rng=rng)

mc = sample_moments(samples)
closed = exponnorm_stats_closed_form(K, loc=loc, scale=scale)

print("(mean, var, skew, kurt_excess)")
print("mc   :", mc)
print("closed:", closed)
(mean, var, skew, kurt_excess)
mc   : (0.6940026389829959, 2.190001974195752, 0.5937727858392163, 1.170714839086485)
closed: (0.6900000000000002, 2.1901000000000006, 0.5987419144908114, 1.201611672415372)

8) Visualization#

We’ll visualize:

  • the PDF and CDF

  • Monte Carlo samples (histogram)

  • empirical CDF vs the theoretical CDF

# PDF/CDF and Monte Carlo comparison

K, loc, scale = 0.9, -0.3, 1.1
rv = stats.exponnorm(K, loc=loc, scale=scale)

n = 80_000
samples = sample_exponnorm_numpy(n, K=K, loc=loc, scale=scale, rng=rng)

x_grid = x_grid_for_plotting(rv, q_low=0.001, q_high=0.999, n=600)

# Histogram + PDF
fig = px.histogram(
    samples,
    nbins=80,
    histnorm="probability density",
    title=f"Monte Carlo samples vs PDF (n={n:,}, K={K:g}, loc={loc:g}, scale={scale:g})",
    labels={"value": "x"},
)
fig.add_trace(go.Scatter(x=x_grid, y=rv.pdf(x_grid), mode="lines", name="SciPy PDF"))
fig.update_layout(yaxis_title="density")
fig.show()


# Empirical CDF vs theoretical CDF
xs, ys = empirical_cdf(samples)

x_grid = np.linspace(float(np.quantile(xs, 0.001)), float(np.quantile(xs, 0.999)), 600)

fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=ys, mode="lines", name="empirical CDF"))
fig.add_trace(go.Scatter(x=x_grid, y=rv.cdf(x_grid), mode="lines", name="theoretical CDF"))
fig.update_layout(
    title=f"Empirical CDF vs theoretical CDF (n={n:,})",
    xaxis_title="x",
    yaxis_title="F(x)",
)
fig.show()

9) SciPy Integration#

SciPy implements this distribution as scipy.stats.exponnorm.

Useful methods:

  • pdf, logpdf

  • cdf, logcdf

  • sf, logsf (often more accurate in the tail)

  • rvs for sampling

  • fit for MLE parameter estimation

About fit

  • exponnorm.fit(data) estimates ((K,\text{loc},\text{scale})) via maximum likelihood.

  • As (K\to 0), the distribution approaches a normal and the MLE can become numerically delicate (boundary effects).

  • For more control, you can fix parameters (e.g., floc=...) or fit via a custom optimizer on logpdf.
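For example, pinning the location during fitting (a minimal sketch; the simulated data and seed are illustrative):

```python
import numpy as np
from scipy import stats

rng_fit = np.random.default_rng(0)
data = stats.exponnorm.rvs(1.2, loc=0.0, scale=1.0, size=2_000, random_state=rng_fit)

# Full 3-parameter MLE vs. MLE with the location held at its known value
K_free, loc_free, scale_free = stats.exponnorm.fit(data)
K_fixed, loc_fixed, scale_fixed = stats.exponnorm.fit(data, floc=0.0)

print("free :", (K_free, loc_free, scale_free))
print("fixed:", (K_fixed, loc_fixed, scale_fixed))  # loc_fixed is exactly 0.0
```

Fixing a parameter reduces the dimension of the optimization, which often stabilizes the fit when the free MLE wanders near the (K\approx 0) boundary.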

# Basic SciPy usage + fit demo

K, loc, scale = 1.1, -0.4, 0.9
rv = stats.exponnorm(K, loc=loc, scale=scale)

xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print("pdf:", rv.pdf(xs))
print("cdf:", rv.cdf(xs))

# Sampling via SciPy
samples_scipy = rv.rvs(size=5_000, random_state=rng)

# Fit parameters back
K_hat, loc_hat, scale_hat = stats.exponnorm.fit(samples_scipy)
print("fit (K, loc, scale):", (float(K_hat), float(loc_hat), float(scale_hat)))
pdf: [0.027719 0.161068 0.327328 0.275103 0.129876]
cdf: [0.010279 0.093035 0.347585 0.667741 0.867593]
fit (K, loc, scale): (1.1180461380307114, -0.4026713345023387, 0.8899021130108721)

10) Statistical Use Cases#

10.1 Hypothesis testing / anomaly detection#

If you have a baseline model (X\sim\mathrm{exponnorm}(K,\text{loc},\text{scale})), then an unusually large observation (x_\text{obs}) can be scored by a right-tail probability:

[ \text{p-value} = \Pr(X \ge x_\text{obs}) = \mathrm{sf}(x_\text{obs}). ]

For model comparison (e.g., “is a normal enough?”), likelihood-ratio statistics are natural — but because (K\ge 0) and the normal corresponds to the boundary case (K=0), a parametric bootstrap is a safer way to calibrate the test than relying on a (\chi^2) reference.

10.2 Bayesian modeling#

A Bayesian model uses the same latent decomposition:

[ x_i = \mu + \sigma z_i + d_i,\quad z_i\sim\mathcal{N}(0,1),\quad d_i\sim\mathrm{Exp}(\lambda). ]

Choose priors like:

  • (\mu\sim\mathcal{N}(m_0,s_0^2))

  • (\sigma\sim\mathrm{HalfNormal}(s))

  • (\lambda\sim\mathrm{Gamma}(\alpha,\beta))

This representation is convenient in probabilistic programming systems (Stan, PyMC, Turing).
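A prior-predictive simulation of this model needs only NumPy (a sketch; the hyperparameter values (m_0, s_0, s, \alpha, \beta) below are illustrative choices, not recommendations):

```python
import numpy as np

rng_pp = np.random.default_rng(0)

# Prior-predictive draws for x_i = mu + sigma*z_i + d_i with the example priors.
m0, s0 = 0.0, 1.0        # mu    ~ Normal(m0, s0^2)
s_half = 1.0             # sigma ~ HalfNormal(s_half)
alpha, beta = 2.0, 2.0   # lam   ~ Gamma(alpha, rate=beta)

n_draws, n_obs = 500, 50
mu = rng_pp.normal(m0, s0, size=n_draws)
sigma = np.abs(rng_pp.normal(0.0, s_half, size=n_draws))   # half-normal via |N(0, s^2)|
lam = rng_pp.gamma(alpha, scale=1.0 / beta, size=n_draws)  # NumPy gamma takes shape/scale

z = rng_pp.normal(size=(n_draws, n_obs))
d = rng_pp.exponential(size=(n_draws, n_obs)) / lam[:, None]  # Exp(rate=lam) per draw
x_pp = mu[:, None] + sigma[:, None] * z + d

print("prior-predictive shape:", x_pp.shape)
print("pooled mean:", float(x_pp.mean()))
```

Inspecting draws like these before seeing data is a cheap way to confirm the priors imply plausible observation scales.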

10.3 Generative modeling#

exponnorm is a useful component distribution for skewed data:

  • mixture models (mixture of exGaussians)

  • emission models in HMMs for duration-like observations

  • synthetic data generation for latency / response-time benchmarks

# 10.1 Tail probability (anomaly scoring) + bootstrap LRT vs Normal

# Tail probability under a baseline model
K0, loc0, scale0 = 0.9, -0.3, 1.1
rv0 = stats.exponnorm(K0, loc=loc0, scale=scale0)

x_obs = float(rv0.ppf(0.995))
p_value = float(rv0.sf(x_obs))  # P(X >= x_obs)

print("x_obs:", x_obs)
print("p_value (right tail):", p_value)


# Model comparison: ExponNorm vs Normal via parametric bootstrap
n = 400
x = rv0.rvs(size=n, random_state=rng)

# Fit Normal (2 params)
mu_hat, sigma_hat = stats.norm.fit(x)
ll_norm = float(np.sum(stats.norm.logpdf(x, loc=mu_hat, scale=sigma_hat)))

# Fit ExponNorm (3 params)
K_hat, loc_hat, scale_hat = stats.exponnorm.fit(x)
ll_expn = float(np.sum(stats.exponnorm.logpdf(x, K_hat, loc=loc_hat, scale=scale_hat)))

T_obs = 2.0 * (ll_expn - ll_norm)

B = 30  # increase for a more stable p-value
T_boot = []
for _ in range(B):
    xb = stats.norm.rvs(loc=mu_hat, scale=sigma_hat, size=n, random_state=rng)

    mu_b, sigma_b = stats.norm.fit(xb)
    ll_n_b = float(np.sum(stats.norm.logpdf(xb, loc=mu_b, scale=sigma_b)))

    K_b, loc_b, scale_b = stats.exponnorm.fit(xb)
    ll_e_b = float(np.sum(stats.exponnorm.logpdf(xb, K_b, loc=loc_b, scale=scale_b)))

    T_boot.append(2.0 * (ll_e_b - ll_n_b))

p_boot = float(np.mean(np.asarray(T_boot) >= T_obs))

print("T_obs:", T_obs)
print("bootstrap p-value (Normal is enough?):", p_boot)


# 10.3 Simple mixture of two exponnorm components (generative modeling)

w = 0.6
n = 60_000
n1 = int(w * n)
n2 = n - n1

x1 = sample_exponnorm_numpy(n1, K=0.4, loc=-1.0, scale=0.7, rng=rng)
x2 = sample_exponnorm_numpy(n2, K=1.5, loc=1.0, scale=0.9, rng=rng)

x_mix = np.concatenate([x1, x2])
rng.shuffle(x_mix)

fig = px.histogram(
    x_mix,
    nbins=90,
    histnorm="probability density",
    title="Mixture of two `exponnorm` components",
    labels={"value": "x"},
)
fig.show()
x_obs: 5.556442873253898
p_value (right tail): 0.004999999999999975
T_obs: 16.563440711130625
bootstrap p-value (Normal is enough?): 0.0

11) Pitfalls#

  • Invalid parameters: SciPy requires K > 0 and scale > 0.

  • Parameterization confusion: many references use ((\mu,\sigma,\lambda)); SciPy uses (K=1/(\sigma\lambda)). Track whether you mean rate (\lambda) or mean delay (\tau=1/\lambda).

  • Right-skew only: exponnorm produces positive skew. If your data is left-skewed, consider reflecting the data or using a different family.

  • Fitting near (K\approx 0): the normal is a boundary case; optimization can be sensitive and standard LRT asymptotics may fail.

  • Tail numerics: for extreme (x), 1 - cdf(x) can suffer catastrophic cancellation. Prefer sf / logsf, and prefer logpdf over pdf in the far tail.
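The reflection trick for left-skewed data in practice (a hedged sketch: simulate left-skewed data, fit exponnorm to its negation, then interpret the fitted model for (-X)):

```python
import numpy as np
from scipy import stats

rng_ref = np.random.default_rng(1)

# Left-skewed data: negate an exGaussian sample (long LEFT tail).
y = -stats.exponnorm.rvs(1.5, loc=0.0, scale=1.0, size=3_000, random_state=rng_ref)
print("sample skewness of y:", float(stats.skew(y)))  # negative

# Fit exponnorm to the reflected data ...
K_hat, loc_hat, scale_hat = stats.exponnorm.fit(-y)
rv_neg = stats.exponnorm(K_hat, loc=loc_hat, scale=scale_hat)

# ... then translate back: f_Y(t) = f_X(-t) and F_Y(t) = P(X >= -t) = sf_X(-t).
t = -2.5
pdf_y = rv_neg.pdf(-t)
cdf_y = rv_neg.sf(-t)
print("fitted (K, loc, scale):", (K_hat, loc_hat, scale_hat))
print("f_Y(t), F_Y(t) at t=-2.5:", (float(pdf_y), float(cdf_y)))
```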

# Numerical pitfall demo: 1 - cdf vs sf in the far right tail

K, loc, scale = 1.2, 0.0, 1.0
rv = stats.exponnorm(K, loc=loc, scale=scale)

x = 50.0
sf_naive = 1.0 - rv.cdf(x)
sf_stable = rv.sf(x)
logsf = rv.logsf(x)

{"1-cdf": sf_naive, "sf": float(sf_stable), "logsf": float(logsf)}
{'1-cdf': 0.0, 'sf': 1.1355160639014264e-18, 'logsf': -41.31944444444445}

12) Summary#

  • exponnorm is a continuous distribution on (\mathbb{R}) modeling Normal + Exponential (noise + delay).

  • SciPy parameters: (K>0), loc, scale; mean (= \text{loc} + K,\text{scale}), variance (= \text{scale}^2(1+K^2)).

  • Shape depends only on (K): skewness (\gamma_1 = 2K^3/(1+K^2)^{3/2}), excess kurtosis (\gamma_2 = 6K^4/(1+K^2)^2).

  • Sampling is simple with NumPy: normal + exponential, then an affine transform.

  • In practice, prefer logpdf, logcdf, and sf/logsf for tail computations, and treat fit near (K\approx 0) with care.

References#

  • SciPy docstring: scipy.stats.exponnorm

  • Exponentially modified Gaussian distribution (Wikipedia)